Multivariate regression analysis of distance matrices for testing associations between gene expression patterns and related variables

Abstract

A fundamental step in the analysis of gene expression and other high-dimensional genomic data is the calculation of the similarity or distance between pairs of individual samples in a study. If one has collected N total samples and assayed the expression level of G genes on those samples, then an N × N similarity matrix can be formed that reflects the correlation or similarity of the samples with respect to the expression values over the G genes. This matrix can then be examined for patterns via standard data reduction and cluster analysis techniques. We consider an alternative to conventional data reduction and cluster analyses of similarity matrices that is rooted in traditional linear models. This analysis method allows predictor variables collected on the samples to be related to variation in the pairwise similarity/distance values reflected in the matrix. The proposed multivariate method avoids the need for reducing the dimensions of a similarity matrix, can be used to assess relationships between the genes used to construct the matrix and additional information collected on the samples under study, and can be used to analyze individual genes or groups of genes identified in different ways. The technique can be used with any high-dimensional assay or data type and is ideally suited for testing subsets of genes defined by their participation in a biochemical pathway or other a priori grouping. We showcase the methodology using three published gene expression data sets.
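
The abstract does not give the test statistic, so the following is only a minimal sketch of one common way to relate predictor variables to an N × N distance matrix. It assumes a correlation-based distance, a Gower-centered matrix, a pseudo-F ratio of traces, and a permutation p-value. The function names (`correlation_distance`, `gower_center`, `pseudo_f`, `permutation_p_value`) and the choice of distance are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def correlation_distance(expr):
    """Build an N x N distance matrix from an N-samples x G-genes array.

    Assumed distance: d_ij = sqrt(2 * (1 - r_ij)), where r_ij is the Pearson
    correlation between samples i and j across the G genes.
    """
    r = np.clip(np.corrcoef(expr), -1.0, 1.0)  # rows are samples -> N x N
    return np.sqrt(2.0 * (1.0 - r))


def gower_center(dist):
    """Double-center A = -0.5 * D**2 so that its rows and columns sum to zero."""
    n = dist.shape[0]
    a = -0.5 * dist ** 2
    c = np.eye(n) - np.ones((n, n)) / n
    return c @ a @ c


def pseudo_f(g, x):
    """Pseudo-F: variation in g explained by predictors x relative to residual."""
    n = g.shape[0]
    x1 = np.column_stack([np.ones(n), x])  # add an intercept column
    h = x1 @ np.linalg.pinv(x1)            # projection ("hat") matrix
    m = np.linalg.matrix_rank(x1)          # number of fitted parameters
    resid = np.eye(n) - h
    explained = np.trace(h @ g @ h) / (m - 1)
    unexplained = np.trace(resid @ g @ resid) / (n - m)
    return explained / unexplained


def permutation_p_value(g, x, n_perm=999, seed=0):
    """Permute the sample labels of x to obtain a reference distribution for F."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    observed = pseudo_f(g, x)
    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(g.shape[0])
        if pseudo_f(g, x[idx]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

To test a pathway or other a priori gene subset under these assumptions, one would pass only the corresponding columns of the expression array to `correlation_distance` and repeat the same steps; no dimension reduction of the matrix is involved.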
